Three Case Studies Using Agglomerative Clustering

Authors

  • Rodrigo C. Camargos
  • Maria do Carmo Nicoletti
Abstract

Finding a clustering in a data set is a challenging task, since algorithms usually depend on the adopted inter-cluster distance as well as on the employed definition of cluster diameter. The work described in this paper examines a well-known agglomerative clustering algorithm, AGNES (Agglomerative Nesting), with regard to its performance on three case studies, namely datasets formed by clusters of different sizes, uneven inter-cluster distances, and uneven diameters. Clustering results are evaluated using three well-known indexes: Dunn, Davies-Bouldin, and Rand. Results obtained with K-means were used for comparison. The experiments were organized into three case studies. Their results suggest that AGNES and K-means perform similarly when identifying clusters with different sizes and inter-cluster distances; however, AGNES obtained the best results when dealing with clusters that differ in both size and diameter.
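The kind of evaluation the abstract describes can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not say which linkage or which data were used, so the average-linkage default, the function names (agnes, rand_index, dunn_index) and the toy points below are all assumptions made for illustration. The Davies-Bouldin index and the K-means baseline are left out to keep the sketch short.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def linkage_distance(a, b, points, linkage="average"):
    """Distance between clusters a and b, given as lists of point indices."""
    pair_dists = [dist(points[i], points[j]) for i in a for j in b]
    if linkage == "single":
        return min(pair_dists)
    if linkage == "complete":
        return max(pair_dists)
    return sum(pair_dists) / len(pair_dists)  # average linkage (assumed default)


def agnes(points, k, linkage="average"):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage_distance(
                clusters[ab[0]], clusters[ab[1]], points, linkage
            ),
        )
        clusters[a] += clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def dunn_index(points, labels):
    """Smallest inter-cluster distance divided by the largest cluster diameter."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    groups = list(groups.values())
    min_separation = min(
        dist(p, q) for g, h in combinations(groups, 2) for p in g for q in h
    )
    max_diameter = max(dist(p, q) for g in groups for p in g for q in g)
    return min_separation / max_diameter


# Hypothetical toy data: a large, wide cluster and a small, tight one.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (10.2, 10.1)]
truth = [0, 0, 0, 0, 0, 1, 1]
labels = agnes(points, k=2)
print("Rand index:", rand_index(labels, truth))
print("Dunn index:", dunn_index(points, labels))
```

The merge loop recomputes all pairwise cluster distances at every step, which is fine for a handful of points but far too slow for real datasets. The Dunn index here is undefined when every cluster is a single point, since the largest diameter is then zero.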


Similar Resources

Exploiting parallelism to support scalable hierarchical clustering

A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. Using standard message passing operations reduces interprocess communication while maintaining efficient load balancing. In a series of experiments using a subset of a standard TREC test collection, our par...


Agglomerative Hierarchical Clustering using AVL tree in the case of single-linkage clustering method

The hierarchy is often used to infer knowledge from groups of items and relations in varying granularities. Hierarchical clustering algorithms take an input of pairwise data-item similarities and output a hierarchy of the data-items. This paper presents Bidirectional agglomerative hierarchical clustering to create a hierarchy bottom-up, by iteratively merging the closest pair of data-items into...


Agglomerative Clustering Using Asymmetric Similarities

Algorithms of agglomerative hierarchical clustering using asymmetric similarity measures are studied. Two different measures between two clusters are proposed, one of which generalizes the average linkage for symmetric similarity measures. Asymmetric dendrogram representation is considered after foregoing studies. It is proved that the proposed linkage methods for asymmetric measures have no re...


A Relative Approach to Hierarchical Clustering

This paper presents a new approach to agglomerative hierarchical clustering. Classical hierarchical clustering algorithms are based on metrics which only consider the absolute distance between two clusters, merging the pair of clusters with highest absolute similarity. We propose a relative dissimilarity measure, which considers not only the distance between a pair of clusters, but also how dis...


An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and measuring their similarity is a core part of the MTS analysis process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...




Publication date: 2016